Experiments with syllable-based Zulu-English machine translation

Authors

  • Friedel Wolff
  • Gideon Kotzé
Abstract

Due to morphological complexity and scarce resources, machine translation from Zulu to English is challenging. We investigate the possibility of phrase-based statistical machine translation from Zulu to English using syllables as the tokens in the Zulu source text. Initial experiments on a relatively small but multi-domain data set suggest merit in our approach, with our best syllable-based model outperforming the best word-based model by 12.90% as measured by BLEU. Our syllabification approach is largely language independent, at least within the Bantu language family, and holds promise for similar efforts in related languages.
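
The idea of feeding syllables rather than words to a phrase-based system can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that every syllable is a (possibly empty) run of consonants followed by a single vowel, which fits the mostly open syllables of Zulu but ignores cases such as syllabic "m". The function names and the "_" word-boundary marker are hypothetical choices made for this illustration.

```python
import re

# Assumption: a syllable is zero or more non-vowel characters followed by one vowel.
_SYLLABLE = re.compile(r"[^aeiou]*[aeiou]", re.IGNORECASE)


def syllabify(word: str) -> list[str]:
    """Split a word into open syllables under the simplifying rule above."""
    syllables = _SYLLABLE.findall(word)
    # Any trailing characters after the last vowel stay with the final syllable.
    tail = word[sum(len(s) for s in syllables):]
    if tail:
        if syllables:
            syllables[-1] += tail
        else:
            syllables.append(tail)
    return syllables


def to_syllable_tokens(sentence: str, boundary: str = "_") -> str:
    """Turn a sentence into space-separated syllable tokens.

    The boundary marker (a hypothetical convention) keeps word breaks
    recoverable after syllabification.
    """
    tokens: list[str] = []
    for word in sentence.split():
        tokens.extend(syllabify(word))
        tokens.append(boundary)
    return " ".join(tokens[:-1])


print(to_syllable_tokens("ngiyabonga kakhulu"))
```

The resulting token stream could then be used as the source side of a parallel corpus in place of the word-tokenized Zulu text, leaving the English side unchanged.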

Similar resources

Exploring unsupervised word segmentation for machine translation in the South African context

We explore the application of unsupervised word segmentation algorithms to phrase-based statistical machine translation (SMT) systems, translating from English to four South African languages: Afrikaans, Northern Sotho, Tsonga and Zulu. Positive results in terms of the standard BLEU and NIST scores are obtained for systems translating into Afrikaans and Zulu.

A Hybrid Approach of English-Hindi Named-entity Transliteration

In recent years, machine transliteration has become a focus of research. Both machine translation and transliteration are important for e-governance and web-based online multilingual applications. Machine translation translates the source language into the target language, which results in wrong translations for named entities. Named entities are required to be translated with preserving t...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

A Comparative Study of English-Persian Translation of Neural Google Translation

Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...

Generating Phonetic Cognates to Handle Named Entities in English-Chinese Cross-Language Spoken Document Retrieval

We have developed a technique for automatic transliteration of named entities for English-Chinese cross-language spoken document retrieval (CL-SDR). Our retrieval system integrates machine translation, speech recognition and information retrieval technologies. An English news story forms a textual query that is automatically translated into Chinese words, which are mapped into Mandarin syllable...


Publication date: 2014